Best Bases Bayesian Hierarchical Classifier for Hyperspectral Data Analysis

Authors

  • Joseph T. Morgan
  • Alex Henneguelle
  • Melba M. Crawford
  • Joydeep Ghosh
  • Amy Neuenschwander
Abstract

Classification of hyperspectral data is challenging because of high-dimensional inputs coupled with possibly high-dimensional outputs and a scarcity of labeled information. Previously, a multiclassifier system was formulated in a binary hierarchical framework to group classes for accurate, rapid discrimination. In order to improve performance for small sample sizes, a new approach was developed that utilizes a feature reduction scheme which adaptively adjusts to the amount of labeled data available, while exploiting the fact that certain adjacent hyperspectral bands are highly correlated. The resulting best-bases binary hierarchical classifier (BB-BHC) family is thus able to address the "small sample size" problem, as evidenced by experimental results obtained from analysis of AVIRIS and Hyperion data acquired over Kennedy Space Center.

INTRODUCTION

The increasing availability of data from hyperspectral sensors provides the capability to characterize the spectral response of targets in greater detail than multispectral sensors, and thereby can potentially improve discrimination between targets. However, the high dimensionality of the data is problematic for supervised statistical classification techniques that utilize the estimated covariance matrix, since the number of labeled samples is typically small relative to the dimension of the data [1]. Previous research has dealt with this problem using a) regularization methods that stabilize the estimated covariance matrix directly or via the pseudo-inverse [2,3], b) transformation of the input space, either by reducing the dimension of the feature space through feature extraction or selection [4,5] or by adding artificially labeled data [6,7], and c) ensembles of classifiers (e.g., bagging, simple random sub-sampling, arcing) [8,9]. When sample sizes are very small, these approaches are inadequate: regularized covariance matrices often produce biased estimates; the pseudo-inverse approach does not perform uniformly well over a range of sample sizes; feature extraction methods yield results that are difficult to interpret; and the performance of classifier ensembles is greatly degraded when sample sizes are extremely small [10].

BEST-BASES BAYESIAN HIERARCHICAL CLASSIFIER

A new approach has been developed specifically to address the problem of extremely small sample sizes. It is based on the Binary Hierarchical Classifier (BHC) framework, which creates a multiclassifier system of C-1 classifiers arranged as a binary tree [11]. In the top-down implementation, the root classifier tries to optimally partition the original set of classes into two disjoint meta-classes while simultaneously determining the Fisher discriminant that separates these two subsets. The procedure is recursed, i.e., the meta-class Ω_n at node n is partitioned into two meta-classes (Ω_2n, Ω_2n+1), until the original C classes are obtained at the leaf nodes [12]. The tree structure allows the more natural and easier discriminations to be accomplished earlier [13]. The bottom-up version of the BHC uses an agglomerative clustering algorithm whereby the two most "similar" meta-classes are merged until only one meta-class remains; Fisher's discriminant is again used as the distance measure for determining the order in which the classes are merged. Both algorithms perform quite well for high-dimensional input and output problems, provided the training samples are not extremely small.
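
The following minimal Python sketch is not the authors' implementation; it only illustrates the bottom-up agglomeration idea described above, under the assumption that "similarity" is the Fisher discriminant value between two groups of labeled pixels (the small ridge term and the function names are illustrative choices, not from the paper). Starting from one meta-class per class, the pair of meta-classes with the lowest Fisher separation is merged first, so the hardest-to-distinguish groups are combined earliest.

```python
import numpy as np

def fisher_separation(Xa, Xb, reg=1e-6):
    """Fisher discriminant value between two sample sets:
    J = (m_a - m_b)^T S_w^{-1} (m_a - m_b), with S_w the pooled within-group
    scatter. A small J means the two groups are hard to separate ("similar")."""
    ma, mb = Xa.mean(axis=0), Xb.mean(axis=0)
    Sw = np.cov(Xa, rowvar=False) * (len(Xa) - 1) \
       + np.cov(Xb, rowvar=False) * (len(Xb) - 1)
    Sw += reg * np.eye(Sw.shape[0])           # small ridge term for stability
    diff = ma - mb
    return float(diff @ np.linalg.solve(Sw, diff))

def bottom_up_merge_order(X, y):
    """Skeleton of the bottom-up (agglomerative) BHC: return the sequence in
    which meta-classes are merged, most-similar (lowest J) pair first."""
    meta = {c: [c] for c in np.unique(y)}     # meta-class id -> original classes
    merges = []
    while len(meta) > 1:
        keys = list(meta)
        a, b = min(
            ((u, v) for i, u in enumerate(keys) for v in keys[i + 1:]),
            key=lambda pair: fisher_separation(
                X[np.isin(y, meta[pair[0]])], X[np.isin(y, meta[pair[1]])]),
        )
        merges.append((list(meta[a]), list(meta[b])))
        meta[a] = meta[a] + meta.pop(b)       # merge meta-class b into a
    return merges
```

In the actual BHC, each internal node also retains the Fisher projection found at that node and uses it to classify new pixels; the sketch above reproduces only the merge ordering.
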
The new method extends the TD-BHC and BU-BHC approaches through a best-bases feature extraction technique that exploits the highly correlated bands observed within hyperspectral data when it is advantageous to do so. Jia and Richards proposed a Segmented Principal Components Transformation (SPCT) that also exploits this characteristic [14]. However, SPCT does not guarantee good discrimination capability because the PCT criterion is related to variance, not discrimination between classes. Further, the SPCT is based on the correlation matrix over all classes, and thus loses information from the class-conditional correlation matrices. Kumar et al. proposed band combination techniques inspired by best-basis functions [15]. Adjacent bands were selected for merging/splitting in a bottom-up/top-down approach using the product of a correlation measure and a Fisher-based discrimination measure. Although these methods exploit band ordering and yield excellent discrimination, they are computationally expensive. Additionally, the quality of the discrimination functions, and thus the structure of the resulting feature space, is affected by the amount of training data. The new approach applies a best-bases band-combining algorithm in conjunction with the BHC framework, while tuning the amount of feature reduction to the quantity of available data. It also exploits the discovered hierarchy of classes to regularize covariance estimates using shrinkage. The method can be viewed as a "best-bases" BHC that performs a band-combining stage prior to the partitioning (TD variant) or combining (BU variant) of meta-classes. Band combination is performed on highly correlated, spectrally adjacent bands. Because the band-to-band correlation is class specific, the band reduction algorithm must be class dependent. In order to estimate the "correlation" of a group of bands (a meta-band) B = [p:q] over a set of classes Ω, the correlation measure Q(B) is defined as ...
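
The definition of Q(B) is cut off in the preview above, so the sketch below simply assumes Q(B) to be the minimum absolute pairwise band correlation within the meta-band, taken over every class-conditional correlation matrix; this reading is suggested by, but not confirmed by, the truncated minima in the text. The sketch shows only the band-grouping mechanic: the actual BB-BHC performs this reduction at each node of the hierarchy and tunes the number of retained meta-bands to the amount of labeled data, which is not modeled here.

```python
import numpy as np

def class_conditional_corr(X, y):
    """One band-correlation matrix per class (bands are the columns of X)."""
    return {c: np.corrcoef(X[y == c], rowvar=False) for c in np.unique(y)}

def meta_band_quality(corrs, p, q):
    """Assumed Q(B) for the meta-band B = [p:q] (inclusive band indices):
    the smallest absolute pairwise correlation over all classes."""
    idx = np.arange(p, q + 1)
    return min(np.abs(R[np.ix_(idx, idx)]).min() for R in corrs.values())

def merge_adjacent_bands(X, y, threshold=0.9):
    """Greedily grow meta-bands left to right while Q(B) stays >= threshold;
    each resulting meta-band is summarized by the mean of its member bands."""
    corrs = class_conditional_corr(X, y)
    n_bands = X.shape[1]
    groups, start = [], 0
    for b in range(1, n_bands):
        # if adding band b drops the group's correlation below the threshold,
        # close the current meta-band and start a new one at b
        if meta_band_quality(corrs, start, b) < threshold:
            groups.append((start, b - 1))
            start = b
    groups.append((start, n_bands - 1))
    reduced = np.column_stack([X[:, p:q + 1].mean(axis=1) for p, q in groups])
    return reduced, groups
```

For example, reduced, groups = merge_adjacent_bands(X, y, threshold=0.9) replaces each run of adjacent, highly correlated bands by its mean, shrinking the dimension seen by the Fisher discriminant that partitions or merges meta-classes.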

Similar Articles

Adaptive Feature Selection for Hyperspectral Data Analysis Using a Binary Hierarchical Classifier and Tabu Search

High dimensional inputs coupled with scarcity of labeled data are among the greatest challenges for classification of hyperspectral data. These problems are exacerbated if the number of classes is large. High dimensional output classes can often be handled effectively by decomposition into multiple two(meta)class problems, where each sub-problem is solved using a suitable binary classifier, and...

Random Forests of Binary Hierarchical Classifiers for Analysis of Hyperspectral Data

Statistical classification of hyperspectral data is challenging because the input space is high in dimension and correlated, but labeled information to characterize the class distributions is typically sparse. The resulting classifiers are often unstable and have poor generalization. A new approach that is based on the concept of random forests of classifiers and implemented within a multiclass...

A Probabilistic Bayesian Classifier Approach for Breast Cancer Diagnosis and Prognosis

Basically, medical diagnosis problems are the most effective component of treatment policies. Recently, significant advances have been made in medical diagnosis using data mining techniques. Data mining, or knowledge discovery, is the process of searching large databases to discover patterns and evaluate the probability of future occurrences. In this paper, a Bayesian classifier is used as a non-linear dat...

Hybrid Hierarchical Classifiers for Hyperspectral Data Analysis

We propose a hybrid hierarchical classifier that solves multiclass problems in high dimensional space using a set of binary classifiers arranged as a tree in the space of classes. It incorporates good aspects of both the binary hierarchical classifier (BHC) and the margin tree algorithm, and is effective over a large range of (sample size, input dimensionality) values. Two aspects of the propos...

Publication date: 2001